Nordic Games GmbH

• Started in 2011 as a sister company to Nordic Games Publishing
(We Sing)

• Base IP acquired from JoWooD and DreamCatcher (SpellForce, The
Guild, Aquanox, Painkiller)

• Initially focusing on smaller, niche games

• Acquired THQ IPs in 2013 (Darksiders, Titan Quest, Red Faction, MX
vs. ATV)

• Now shifting towards being a production company with internal devs

• Since fall 2013: internal studio in Munich, Germany (Grimlore
Games)

Leszek Godlewski
Programmer, Nordic Games 




• Ports 

• Painkiller Hell & Damnation (The Farm 51) 

• Deadfall Adventures (The Farm 51) 

• Darksiders (Nordic Games) 

• Formerly generalist programmer on PKHD & DA at TF51 








Objective of this talk

Your game engine on Linux, before porting: 



Missing! 

Objective of this talk (cont.)

Your first "working" Linux port: 



Oops. Bat-Signal! 

Objective of this talk (cont.)


Where I want to try helping you get to: 

In other words, from this:


FPS: 30 Min ms: 33 Avg ms: 33 Max ms: 33

System: 182.6/512.0 

To this:

And that's mostly debugging

All sorts of debuggers!

Demo code available

is.gd/GDCE14Linux 

Intended takeaway

• Build system improvements 

• Signal handlers 

• Memory debugging with Valgrind 

• OpenGL debugging techniques 

Agenda

• Build system improvements 

• Signal handlers 

• Memory debugging with Valgrind 

• OpenGL debugging techniques 

Build systems

What I had initially with UE3: 

• Copy/paste of the Mac OS X toolchain 

• It worked, but... 

• Slow 

• Huge binaries because of debug symbols 

• Problematic linking of circular dependencies 

Build systems (cont.)

• 32-bit binaries required for 
feature/hardware parity with Windows 

• Original solution: a chroot jail with an 
entire 32-bit Ubuntu system just for 
building 

Cross-compiling for 32/64-bit

• gcc -m32/-m64 is not enough! 

• Only sets target code generation 

• Not headers & libraries (CRT, OpenMP, libgcc etc.) 

• Fixed by installing gcc-multilib 

• Dependency package for non-default architectures 
(i.e. i386 on an amd64 system and vice versa) 
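
A quick way to confirm that a multilib setup really produces the intended target is to build a tiny probe with each flag and look at the pointer width. This is a minimal sketch; `target_pointer_bits` and `report_target` are hypothetical helper names, not part of the talk's demo code.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: width of a pointer on the compilation target, in bits. */
int target_pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Prints which target the probe was built for. */
void report_target(void)
{
    int bits = target_pointer_bits();
    assert(bits == 32 || bits == 64);   /* the only targets discussed here */
    printf("built for a %d-bit target\n", bits);
}
```

Built with gcc -m32 the probe should report 32, and with -m64 it should report 64. If the -m32 build stops on missing headers or libraries, the multilib package is probably not installed.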

Clang (ad nauseam)

• Clang is faster 

• gcc: 3m47s 

• Clang: 3m05s 


• More benchmarks at Phoronix [LARABEL13] 

• Clang has different diagnostics than gcc 


gcc:

test.cpp: In function 'int main(int, char**)':
test.cpp:8:2: error: 'integer' was not declared
  integer i = 0;
  ^
test.cpp:8:10: error: expected ';' before 'i'
  integer i = 0;
          ^

Clang:

test.cpp:8:2: error: unknown type name 'integer'; did you mean 'Integer'?
        integer i = 0;
        ^
        Integer
test.cpp:4:13: note: 'Integer' declared here
typedef int Integer;
            ^
1 error generated.

Clang (cont.)

• Preprocessor macro compatibility

• Defines __GNUC__ etc.

• Command line compatibility 

• Easily switch back & forth between Clang & gcc 

Clang - caveats

• C++ object files may be incompatible with
gcc & fail to link (need full rebuilds)

• Clang is not as mature as gcc 

• Occasionally has generated faulty code for me 
(YMMV) 

Clang - caveats (cont.)

• Slight inconsistencies in C++ standard 
strictness 

• Templates 

• Anonymous structs/unions 

• May need to add this-> in some places 

• May need to name some anonymous types 

So: Clang or gcc?

Both: 

• Clang - quick iterations during 
development 

• gcc - final shipping binaries 

Linking - GNU ld

• Default linker on Linux 

• Ancient 

• Single-threaded 

• Requires specification of libraries in the order 
of reverse dependency... 

• We are not doomed to use it! 

Linking - GNU gold

• Multi-threaded linker for ELF binaries

• ld: 18s

• gold: 5s

• Developed at Google, now officially part of 
GNU binutils 



GAME DEVELOPERS CONFERENCE EUROPE 2014, AUGUST 11-13, 2014
GDCEUROPE.COM

Linking - GNU gold (cont.)

• Drop-in replacement for ld

• May need an additional parameter or toolchain
setup

• clang++ -B/usr/lib/gold-ld ...

• g++ -fuse-ld=gold ...

• still needs libs in the order of reverse 
dependency... 

Linking - reverse dependency

• Major headache/game-breaker with 
circular dependencies 

• "Proper" fix: re-specify the same libraries 
over and over again 

• gcc app.o -lA -lB -lA

Linking - reverse dep. (cont.)

[Diagram: app being linked against its libraries; only the missing symbols are taken.]

Linking - library groups

• Declare library groups instead

• Wrap library list with --start-group, --end-group

• Shorthand:

• g++ foo.obj -Wl,-\( -lA -lB -Wl,-\)

• Results in exhaustive search for symbols 

Linking - library groups (cont.)

• Actually used for non-library objects (TUs) 

• Caveat: the exhaustive search! 

• Manual warns of possible performance hit 

• Not observed here, but keep that in mind! 

Running the binary in debugger

inequation@spearhead:~/projects/largebinary$ gdb --silent largebinary
Reading symbols from /home/inequation/projects/largebinary/largebinary...
[zzz... several minutes later...]
done.
(gdb)

Caching the gdb-index

• Large codebases generate heavy debug 
symbols (hundreds of MBs) 

• GDB does symbol indexing at every
single startup

• Massive waste of time! 

Caching the gdb-index (cont.)

• Solution: fold indexing into the build 
process 

• Old linkers: as described in [GNU01]

• New linkers (i.e. gold): --gdb-index

• May need to forward from compiler driver:
-Wl,--gdb-index

Agenda

• Build system improvements 

• Signal handlers 

• Memory debugging with Valgrind 

• OpenGL debugging techniques 

Signal handlers

• Unix signals are async notifications 

• Sources can be: 

• the process itself 

• another process 


• user 

• kernel 

Signal handlers (cont.)

• A lot like interrupts 

Signal handlers (cont.)

• System installs a default handler 

• Usually terminates and/or dumps core 

• Core ~ minidump in Windows parlance, but entire 
mapped address range is dumped (truncated to 
RLIMIT_CORE bytes) 

• See signal(7) for default actions 

Signal handlers (cont.)

• Can (should!) specify custom handlers 

• Get/set handlers via sigaction(2) 

• void handler(int, siginfo_t *, void *); 

• Needs SA_SIGINFO flag in sigaction() call 

• Extensively covered in [BENYOSSEF08] 
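
The installation step described above can be sketched as follows. This is only an illustration under assumptions: `crash_handler`, `install_handler` and the two global variables are hypothetical names, and the handler does nothing except record what it was given.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* sig_atomic_t variables are safe to write from inside a handler. */
volatile sig_atomic_t g_last_signal = 0;
volatile sig_atomic_t g_last_code = 0;

/* Three-argument handler, the form selected by SA_SIGINFO. */
void crash_handler(int signum, siginfo_t *info, void *context)
{
    (void)context;                      /* ucontext_t *, unused in this sketch */
    g_last_signal = signum;
    g_last_code = info ? info->si_code : 0;
}

/* Installs crash_handler for one signal; returns 0 on success, -1 on failure. */
int install_handler(int signum)
{
    struct sigaction sa;
    assert(signum > 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = crash_handler;    /* not sa_handler: we want siginfo_t */
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}
```

A real crash handler would call `install_handler` once per signal of interest at startup and then inspect `si_code` and `si_addr` inside the handler.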

Interesting siginfo_t fields

• si_code - reason for sending the signal 

• Examples: signal source, FP over/underflow, 
memory permissions, unmapped address 

• si_addr - memory location (if relevant) 

• SIGILL, SIGFPE, SIGSEGV, SIGBUS and 
SIGTRAP 

Interesting signals

• Worth catching 

• SIGSEGV, SIGILL, SIGHUP, SIGQUIT, SIGTRAP, 
SIGIOT, SIGBUS, SIGFPE, SIGTERM, SIGINT 

• Worth ignoring 

• signal(signum, SIG_IGN); 

• SIGCHLD, SIGPIPE 
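
To show why ignoring SIGPIPE is useful, here is a small sketch: once the signal is ignored, writing to a pipe with no reader fails with EPIPE instead of killing the process. The function names are hypothetical and chosen for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Ignore SIGPIPE; returns 0 on success, -1 on failure. */
int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Writes into a pipe whose read end is closed and returns the resulting
   errno (0 if the write unexpectedly succeeded, -1 if pipe() failed). */
int write_to_broken_pipe(void)
{
    int fds[2];
    ssize_t n;
    int err;
    if (pipe(fds) != 0)
        return -1;
    close(fds[0]);                  /* no reader left */
    errno = 0;
    n = write(fds[1], "x", 1);      /* raises SIGPIPE unless it is ignored */
    err = (n < 0) ? errno : 0;
    close(fds[1]);
    return err;
}
```

The same applies to sockets: with SIGPIPE ignored, a dropped connection shows up as an error code the game can handle.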

Signal handling caveats (cont.)

• Prone to race conditions 

• Can't share locks with the main program 


[Diagram: normal flow locks a mutex; the signal handler interrupts it and waits on the same mutex: deadlock.]

Signal handling caveats (cont.)

• Prone to race conditions 

• Can't call async-unsafe/non-reentrant 
functions 

• See signal(7) for a list of safe ones 

• Notable functions not on the list:

• printf() and friends (formatted output)

• malloc() and free()
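
Since printf() is off the table inside a handler, output has to be assembled by hand and sent with write(2), which is on the safe list in signal(7). The following is a minimal sketch; `format_unsigned` and `log_signal` are hypothetical helpers, not functions from the demo code.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Formats value as decimal into buf without printf(); returns the number of
   characters written, or 0 if the buffer is too small. */
size_t format_unsigned(unsigned long value, char *buf, size_t size)
{
    char tmp[24];                   /* 20 digits cover a 64-bit unsigned long */
    size_t n = 0, i;
    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    if (n + 1 > size)
        return 0;
    for (i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];    /* digits were produced in reverse */
    buf[n] = '\0';
    return n;
}

/* Writes "caught signal N\n" to fd using only write(2). */
void log_signal(int fd, int signum)
{
    static const char prefix[] = "caught signal ";
    char num[24];
    size_t len = format_unsigned((unsigned long)signum, num, sizeof(num));
    /* Results are ignored: there is little a crash handler can do about them. */
    (void)!write(fd, prefix, sizeof(prefix) - 1);
    (void)!write(fd, num, len);
    (void)!write(fd, "\n", 1);
}
```

Everything here works on stack buffers, so the heap is never touched.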



Signal handling caveats (cont.)

• Not safe to allocate or free heap memory 


[Figure: heap chunk layout (allocated chunk, freed chunk, other chunks, wilderness chunk, end of available memory) with the chunk bookkeeping STOMPED. Source: [LEA01]]

Signal handling caveats (cont.)

• Custom handlers do not dump core 

• At handler installation time: 

• Raise RLIMIT_CORE to desired core size 

• Inside handler, after custom logging: 

• Restore default handler using signal(2) or 
sigaction(2) 

• raise(signum);
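
The two halves of that recipe can be sketched like this. The function names are hypothetical; the sketch assumes the handler was installed without SA_NODEFER, so the re-raised signal stays pending until the handler returns and then triggers the default action (including the core dump).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/resource.h>

/* At handler installation time: raise the soft RLIMIT_CORE to the hard
   limit, which needs no privileges. Returns 0 on success, -1 on failure. */
int raise_core_limit(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}

/* At the end of the custom handler, after logging: restore the default
   action and send the same signal again. */
void chain_to_default(int signum)
{
    signal(signum, SIG_DFL);
    raise(signum);
}
```

Whether a core file actually appears also depends on system settings outside the process, such as the hard limit and the kernel's core pattern.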

Safe stack walking

• glibc provides backtrace(3) and friends 

• Symbols are read from the dynamic 
symbol table 

• Pass -rdynamic at compile-time to populate 

Safe stack walking (cont.)

• backtrace_symbols() internally calls
malloc()

• Not safe...

• Still, can get away with it most of the time 
(YMMV) 

Example "proper" solution

• Fork a watchdog process in main() 

• Communicate over a FIFO pipe 

• In signal handler: 

• Collect & send information down the pipe 

• backtrace_symbols_fd() down the pipe

• Demo code: is.gd/GDCE14Linux 
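
The handler side of that idea can be sketched as below: the stack is captured into a fixed-size array and symbolised straight into a file descriptor, which in the watchdog setup would be the write end of the pipe. `dump_backtrace` and `MAX_FRAMES` are hypothetical names, and the sketch assumes glibc's `<execinfo.h>`. It leaves out the fork and pipe setup, which the linked demo code covers.

```c
#include <assert.h>
#include <execinfo.h>   /* glibc-specific */

#define MAX_FRAMES 64

/* Writes the current call stack to fd, one frame per line, and returns the
   number of frames captured. */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);
    /* Unlike backtrace_symbols(), this variant writes directly to fd and
       does not return a heap-allocated array. */
    backtrace_symbols_fd(frames, count, fd);
    return count;
}
```

As far as I know, the first call to backtrace() may itself load a support library, so calling it once at startup, outside any handler, is a common precaution.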

Agenda

• Build system improvements 

• Signal handlers 

• Memory debugging with Valgrind 

• OpenGL debugging techniques 

Is this even related to porting?

• Yes! Portability bugs easily overlooked 

• Hardcoded struct sizes/offsets 

• OpenGL buffers 

• Incorrect binary packing/unpacking 

• "How did we/they manage to ship that?!" 

What is Valgrind?

• Framework for dynamic, runtime analysis 

• Dynamic recompilation 

• machine code → IR → tool → machine code

• Performance typically at 20-25% of unmodified
code

• Worse if heavily threaded - execution is serialized 

What is Valgrind? (cont.)

• Many tools in it: 

• Memory error detectors (Memcheck, 
SGcheck) 

• Cache profilers (Cachegrind, Callgrind) 

• Thread error detectors (Helgrind, DRD) 

• Heap profilers (Massif, DHAT) 

Memcheck basics

• Basic usage extremely simple

• ...as long as you use the vanilla libc malloc()

• valgrind ./app 

• Will probably report a ton of errors on the 
first run! 

• Again: "How did they manage to ship that?!" 

Memcheck basics (cont.)

• Many false positives, esp. in third-party code

• Xlib, NVIDIA driver

• Can suppress them via suppression files

• Call Valgrind with --gen-suppressions=yes to 
generate suppression definitions 

• Be careful with that! Can let OpenGL bugs slip! 
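
Contrived example

#include <stdlib.h>
int main(int argc, char *argv[]) {
    int foo, *ptr1 = &foo;            /* foo is deliberately left uninitialised */
    int *ptr2 = malloc(sizeof(int));  /* room for one int, never freed */
    if (*ptr1)                        /* branch on an uninitialised value */
        ptr2[1] = 0xabad1dea;         /* invalid write past the block */
    else
        ptr2[1] = 0x15bad700;         /* invalid write past the block */
    ptr2[0] = ptr2[2];                /* invalid read past the block */
    return *ptr1;                     /* exit status from uninitialised data */
}

Demo code: is.gd/GDCE14Linux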

Valgrind output for such

==8925== Conditional jump or move depends on 
uninitialised value(s) 

==8925== Invalid write of size 4 
==8925== Invalid read of size 4 
==8925== Syscall param exit_group(status) 
contains uninitialised byte(s) 

==8925== LEAK SUMMARY: 

==8925== definitely lost: 4 bytes in 1 blocks 

What about custom allocators?

• Custom memory pool & allocation algo

• Valgrind only "sees" mmap()/munmap() of
multiples of entire memory pages

• All access within those pages - now valid! 

• How to track errors? 

Client requests

• Allow annotation of custom allocators 

• ~20 C macros defined in valgrind.h 

• Common and per-tool requests exist 

• Can be cut out with -DNVALGRIND 

• Detailed description in [VALGRIND01]
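
As an illustration of annotating a custom allocator, here is a sketch of a trivial bump allocator that reports its blocks to Memcheck. The allocator itself (`pool_init`, `pool_alloc`, `pool_free`, `POOL_SIZE`) is hypothetical and far simpler than a real one. The no-op fallback macros are an assumption of this sketch, added so that it still builds where Valgrind's headers are not installed.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
/* Fallback no-ops for builds without Valgrind's headers. */
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len)             ((void)0)
#  define VALGRIND_MALLOCLIKE_BLOCK(addr, size, rz, zeroed) ((void)0)
#  define VALGRIND_FREELIKE_BLOCK(addr, rz)                 ((void)0)
#endif

#define POOL_SIZE 4096
static unsigned char g_pool[POOL_SIZE];
static size_t g_used = 0;

/* Resets the pool and tells Memcheck that nothing in it may be touched. */
void pool_init(void)
{
    g_used = 0;
    VALGRIND_MAKE_MEM_NOACCESS(g_pool, POOL_SIZE);
}

/* Hands out 16-byte aligned blocks; returns NULL when the pool is exhausted. */
void *pool_alloc(size_t size)
{
    size_t aligned;
    void *p;
    if (size == 0 || size > POOL_SIZE)
        return NULL;
    aligned = (size + 15u) & ~(size_t)15u;
    if (aligned > POOL_SIZE - g_used)
        return NULL;
    p = g_pool + g_used;
    g_used += aligned;
    VALGRIND_MALLOCLIKE_BLOCK(p, size, 0, 0);  /* now addressable, undefined */
    return p;
}

/* The bump allocator never reuses memory; it only informs Memcheck. */
void pool_free(void *p)
{
    (void)p;
    VALGRIND_FREELIKE_BLOCK(p, 0);             /* block becomes inaccessible */
}
```

With these annotations in place, Memcheck can again flag overruns, use-after-free and leaks inside the pool, roughly as it does for libc malloc().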

Example: Instrumenting dlmalloc

• 2.8.4 instrumentation from [CRYSTAL01]

• Demo code: is.gd/GDCE14Linux

• Compile the sample with -DDLMALLOC

• Similar results to libc malloc()

Other uses of client requests

• Pointer validation 

• Is address mapped? Is it defined? 

• Mid-session leak checks 

• Level transitions 
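
Both uses can be sketched with Memcheck's client requests. `pointer_looks_valid` and `on_level_unloaded` are hypothetical names. The fallback macros are an assumption of this sketch for builds without Valgrind's headers; they mimic how the real requests evaluate to 0 when the program is not running under Valgrind.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
/* Fallbacks for builds without Valgrind's headers. */
#  define RUNNING_ON_VALGRIND                          (0)
#  define VALGRIND_CHECK_MEM_IS_ADDRESSABLE(addr, len) (0)
#  define VALGRIND_CHECK_MEM_IS_DEFINED(addr, len)     (0)
#  define VALGRIND_DO_LEAK_CHECK                       ((void)0)
#endif

/* Returns 1 if Memcheck considers [p, p+len) both mapped and initialised.
   Outside Valgrind the requests report no problem, so this returns 1. */
int pointer_looks_valid(const void *p, size_t len)
{
    (void)p;
    (void)len;
    if (VALGRIND_CHECK_MEM_IS_ADDRESSABLE(p, len) != 0)
        return 0;
    if (VALGRIND_CHECK_MEM_IS_DEFINED(p, len) != 0)
        return 0;
    return 1;
}

/* Mid-session leak check, e.g. right after a level has been torn down. */
void on_level_unloaded(void)
{
    if (RUNNING_ON_VALGRIND)
        VALGRIND_DO_LEAK_CHECK;
}
```

Anything still reported as reachable right after a level unload is a good candidate for a per-level leak.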

Other uses of client req. (cont.)

• Poisoning memory regions 

• Ensuring signal handlers don't touch the heap 

• Ensuring geometry buffers aren't read on CPU 


[Figure: heap chunk layout (allocated chunk, freed chunk, other chunks, wilderness chunk) showing size/status headers, user data space and pointers to the next and previous chunk. Source: [LEA01]]

Debugging inside Valgrind

• A gdbserver for "remote" debugging 

• SIGTRAP (breakpoint) on every error 

• Unlimited memory watchpoints! 

• Data breakpoints in Visual Studio parlance 

• Cf. 4 single-word hardware debug registers on 
x86 

Debugging inside Valgrind (cont.)

• Terminal A:

• valgrind --vgdb=yes --vgdb-error=0 ./MyGame

• Terminal B: 

• gdb ./MyGame 

• target remote | vgdb 

Agenda

• Build system improvements 

• Signal handlers 

• Memory debugging with Valgrind 

• OpenGL debugging techniques 

Ye Olde Way

• Call glGetError() after each OpenGL call

• Get 1 of 8 (sic!) error codes 

• Look up the call in the manual 

• See what this particular error means in 
this particular context... 

Ye Olde Way (cont.)

• ...Then check what was actually the case

• 6 possible reasons for GL_INVALID_VALUE in
glTexImage*() alone! See [OPENGL01]

• Usually: attach a debugger, replay the
scenario...

• This used to suck!

Debug callback

• Never call glGetError() again! 

• Much more detailed information 

• Incl. performance tips from the driver 

• Good to check what different drivers say 

• May not work without a debug OpenGL 
context (GLX_CONTEXT_DEBUG_BIT_ARB) 

Debug callback (cont.)

• Provided by either of (ABI-compatible):
GL_KHR_debug [OPENGL02],
GL_ARB_debug_output [OPENGL03]

Extension          OpenGL  OpenGL ES  NVIDIA      AMD         Intel   AMD
                                      (official)  (official)  (Mesa)  (Mesa)
ARB_debug_output   yes     no         yes         yes         yes     yes
KHR_debug          yes     yes        yes         yes         no      no

Debug callback (cont.)

void callback(GLenum source,
              GLenum type,
              GLuint id,
              GLenum severity,
              GLsizei length,
              const GLchar* message,
              const void* userParam);

• Filter by source, type, severity or individual messages



Debug callback (cont.)

• Verbosity can be controlled (filtering) 

• glDebugMessageControl[ARB]()

• [OPENGL02][OPENGL03] 

• Turn to 11 (GL_DONT_CARE) for valuable 
perf information! 

• Memory type for buffers, unused mip levels... 

API call tracing

• Record a trace of the run of the application 

• Replay and review the trace 

• Look up OpenGL state at a particular call 

• Inspect state variables, resources and objects: 
textures, shaders, buffers... 

• apitrace or VOGL 

Well, this is not helpful.

[Screenshot: an unannotated API trace, a flat stream of glGenBuffers, glNamedBufferDataEXT, glMapNamedBufferEXT, glMapBufferRange, memcpy, glUnmapNamedBufferEXT, glBindBuffer, glBufferData and glUnmapBuffer calls.]

Much better!

[Screenshot: an API trace annotated with nested glPushDebugGroup/glPopDebugGroup groups (update, drawUI, fillRect, MOJOSHADER_glProgramReady) and glDebugMessageInsert messages such as "FBO up to date".]

Annotating the call stream

[Table not recoverable from the slide: a comparison of call-stream annotation extensions (including EXT and GREMEDY variants) by feature, e.g. call grouping, with some support marked "Limited".]

Annotating the call stream (cont.)

• All aforementioned extensions supported 
by apitrace regardless of driver 

• Recommended: GL_KHR_debug 

Annotating the call stream (cont.)

• Call grouping

• glPushDebugGroup()/glPopDebugGroup()

• One-off messages

• glDebugMessageInsert[ARB]()

• glStringMarkerGREMEDY()

Object labelling

• glObjectLabel(), glGetObjectLabel()

• Buffer, shader, program, vertex array, query, program pipeline,
transform feedback, sampler, texture, render buffer, frame
buffer, display list

• glObjectPtrLabel(),
glGetObjectPtrLabel()

• Sync objects 

Annotation caveats

• Multi-threaded grouping may break
hierarchy

• glDebugMessageInsert() calls the debug
callback, polluting error streams

• Workaround: drop if source ==
GL_DEBUG_SOURCE_APPLICATION
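
That workaround can be sketched as a callback that returns early for application-sourced messages. `debug_callback` and `g_reported` are hypothetical names. The typedefs and constants in the `#ifndef` block are stand-ins so the sketch builds without OpenGL headers; a real build would take them from the GL headers or the extension loader. A real callback would also need the APIENTRY calling convention and registration through glDebugMessageCallback, both omitted here.

```c
#include <assert.h>
#include <stdio.h>

#ifndef GL_DEBUG_SOURCE_APPLICATION
/* Stand-ins for builds without OpenGL headers. */
typedef unsigned int GLenum;
typedef unsigned int GLuint;
typedef int GLsizei;
typedef char GLchar;
#define GL_DEBUG_SOURCE_API            0x8246
#define GL_DEBUG_SOURCE_APPLICATION    0x824A
#define GL_DEBUG_TYPE_ERROR            0x824C
#define GL_DEBUG_TYPE_OTHER            0x8251
#define GL_DEBUG_SEVERITY_NOTIFICATION 0x826B
#define GL_DEBUG_SEVERITY_HIGH         0x9146
#endif

/* Number of messages that made it past the filter. */
unsigned g_reported = 0;

void debug_callback(GLenum source, GLenum type, GLuint id, GLenum severity,
                    GLsizei length, const GLchar *message,
                    const void *userParam)
{
    (void)length;
    (void)userParam;
    /* Our own annotations come back through this callback; drop them. */
    if (source == GL_DEBUG_SOURCE_APPLICATION)
        return;
    g_reported++;
    fprintf(stderr, "GL debug: type 0x%x, id %u, severity 0x%x: %s\n",
            type, id, severity, message);
}
```

The annotations still show up in API traces; they are only kept out of the error log.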

Example 1: PIX events emulation

#define D3DPERF_BeginEvent(colour, name) \
    if (GLEW_KHR_debug && threadOwnsDevice()) \
        glPushDebugGroup(GL_DEBUG_SOURCE_APPLICATION, \
                         (GLuint)colour, -1, name)

#define D3DPERF_EndEvent() \
    if (GLEW_KHR_debug && threadOwnsDevice()) \
        glPopDebugGroup()

Example 2: Game tech demo

• University assignment
from 2009

• Annotated OpenGL 1.4 

• Demo code: 
is.gd/GDCE14Linux 

Takeaway

• gcc-multilib is the prerequisite for 32/64- 
bit cross-compilation 

• Switching back and forth between Clang 
and gcc is easy and useful 

• Link times can be greatly improved by 
using gold 

Takeaway (cont.)

• Caching the gdb-index improves 
debugging experience 

• Crash handling is easy to do, tricky to get 
right 

Takeaway (cont.)

• Valgrind is an enormous aid in memory
debugging 

• Even when employing custom allocators 

• OpenGL debugging experience can be 
vastly improved using some extensions 

Questions?

lgodlewski@nordicgames.at
@TheIneQuation
inequation.org

Thank you!

Further Nordic Games information:
www.nordicgames.at

Development information:
www.grimloregames.com

Bonus slides!

• OpenGL resource leak checking

• Intel i965 driver vs stack

• Locating user data according to
FreeDesktop.org guidelines

• Thread priorities in Linux

• Additional/new debug features

OpenGL resource leak checking

Courtesy of Eric Lengyel & Fabian Giesen

static void check_for_leaks()
{
    GLuint max_id = 10000; // better idea would be to keep track of assigned names.
    GLuint id;

    // if brute force doesn't work, you're not applying it hard enough
    for (id = 1; id <= max_id; id++)
    {
#define CHECK(type) if (glIs##type(id)) fprintf(stderr, "GLX: leaked " #type " handle 0x%x\n", (unsigned int)id)
        CHECK(Texture);
        CHECK(Buffer);
        CHECK(Framebuffer);
        CHECK(Renderbuffer);
        CHECK(VertexArray);
        CHECK(Shader);
        CHECK(Program);
        CHECK(ProgramPipeline);
#undef CHECK
    }
}

Intel i965 vs stack

• Been chasing a segfault on a call instruction
down _mesa_Clear() (glClear())

• Region of code copy/pasted from D3D renderer

• Address mapped, so not an invalid jump... 

• Only 16 function frames - surely this can't be 
a stack overflow? 

Intel i965 vs stack (cont.)

• Oh no, wait:

• Check ESP against /proc/[pid]/maps

• Yup, encroaching on unmapped address space

• Moral: cut your renderer some stack slack
(160+ kB), or Mesa will blow it up with
locals (e.g. in clear shader generation)

Locating user data

• There is a spec for that - see [XDG01]

• Savegames, screenshots, options etc.: 

• $XDG_CONFIG_HOME or ~/.config/<app> 

• Caches of all kinds: 

• $XDG_CACHE_HOME or ~/.cache/<app> 

• Per-user persistent data (e.g. DLC): 

• $XDG_DATA_HOME or ~/.local/share/<app> 
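
The lookup for the configuration directory can be sketched as follows; the cache and data directories work the same way with their own variable and default. `xdg_config_dir` is a hypothetical helper, and treating a non-absolute `$XDG_CONFIG_HOME` as unset reflects my reading of the specification rather than anything stated on the slide.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Resolves $XDG_CONFIG_HOME/<app>, falling back to $HOME/.config/<app>.
   Returns 0 on success, -1 if no base directory is available, the app name
   is empty, or the result does not fit into out. */
int xdg_config_dir(const char *app, char *out, size_t size)
{
    const char *base = getenv("XDG_CONFIG_HOME");
    int n;
    if (app == NULL || strlen(app) == 0 || size == 0)
        return -1;
    if (base != NULL && base[0] == '/') {
        n = snprintf(out, size, "%s/%s", base, app);
    } else {
        /* Unset, empty or relative: use the default location instead. */
        const char *home = getenv("HOME");
        if (home == NULL || home[0] == '\0')
            return -1;
        n = snprintf(out, size, "%s/.config/%s", home, app);
    }
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The directory is not created by this helper; the caller still has to create it before writing savegames or options.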

Locating user data (cont.)

• <app> subdirectory currently unregulated 

• De-facto standard: simplified or "Unix name" 

• Lowercase, "safe" ASCII characters, e.g. blender 

• When asked, XDG people suggest rev-DNS 

• com.company.appname

Thread priorities in Linux

• Priority elevation requires root permissions

• No user will ever grant you root (scary!) 

• Reason: DoS protection in servers (probably) 

• Priority can be tweaked with nice() 

• Think "how nice the process is to others" 

• Being nice to everyone will starve your process 

• Niceness can be negative (but only with root) 

Thread priorities in Linux (cont.)

• Why not setpriority(2)?

• Also sets scheduling algorithm: here be dragons

• Priority values have different meaning per scheduler 

• Still needs root 

• What about capabilities(7)? 

• This might actually work if your users trust you 

• Demo code: is.gd/GDCE14Linux 

Thread priorities in Linux (cont.)

• Don't all threads in a process share 
niceness? 

• They should, according to POSIX, but they 
don't! 

• One of the few cases where Linux is non- 
compliant 
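
Because niceness is effectively per-thread on Linux, a background thread can deprioritise itself without root by raising its own niceness. A minimal sketch, with `be_nicer` as a hypothetical helper name:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Adds delta to the caller's niceness. On success returns 0 and stores the
   resulting niceness in *result; on failure returns -1. */
int be_nicer(int delta, int *result)
{
    int value;
    assert(result != NULL);
    errno = 0;
    value = nice(delta);            /* -1 is a legal niceness: check errno */
    if (value == -1 && errno != 0)
        return -1;
    *result = value;
    return 0;
}
```

Calling `be_nicer(5, &n)` from a streaming or worker thread lowers only that thread's priority on Linux. A negative delta would fail without the privileges discussed above.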

Additional/new debug features

• Additional debug info: -g3 

• Including #defines (macros) 

• Better debugger performance [GNU02]: 

• -fdebug-types-section: improved layout 

• -gpubnames: new format for index 